Wu X, Zhang X. Automated inference on criminality using face images[J]. arXiv preprint arXiv:1611.04135, 2016: 4038-4052.

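This paper applies supervised machine learning (logistic regression, KNN, SVM, CNN) to classify face images of criminals (C) and non-criminals (N), reaching a top accuracy of 89.51%, and runs several experiments analyzing how the C and N groups differ: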

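Faces within the N group are more similar to one another, while faces within the C group differ more.
C and N form two concentric but distinctive manifolds.
The variation within C is greater than the variation within N.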

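The paper argues that human judgments based on facial features carry biases and preconceptions, whereas CV algorithms do not have these problems.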

1. Data Preparation

The dataset contains 1856 photos (1126 N + 730 C, Figure 1). Selection criteria: Chinese, male, between the ages of 18 and 55, no facial hair, no facial scars or other markings.
N includes waiters, construction workers, taxi and truck drivers, real estate agents, doctors, lawyers and professors; half have university degrees.
C photos come from the Ministry of Public Security of China, the departments of public security of the provinces of Guangdong, Jiangsu, Liaoning, etc., and a city police department in China.
Within C, 235 people committed violent crimes (murder, rape, assault, kidnap and robbery); the remaining 536 committed non-violent crimes (theft, fraud, abuse of trust (corruption), forgery and racketeering). These two counts sum to 771, not 730.
Only the face and upper-neck region is extracted from each photo.
Images are normalized to 80 × 80.
Each image's histogram is matched to the average histogram of the whole dataset, so that all grayscale images are normalized to the same intensity distribution.
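
The histogram-matching step can be sketched as follows. This is a minimal NumPy illustration, assuming 8-bit grayscale images and a simple CDF-based lookup table; the function names `mean_histogram` and `match_to_histogram` are hypothetical, and the paper's own implementation may differ.

```python
import numpy as np

def mean_histogram(images):
    """Average the normalized 256-bin gray-level histograms of all images."""
    hists = [np.bincount(img.ravel(), minlength=256) / img.size for img in images]
    return np.mean(hists, axis=0)

def match_to_histogram(img, target_hist):
    """Remap the gray levels of a uint8 image so its CDF follows the target CDF."""
    src_hist = np.bincount(img.ravel(), minlength=256) / img.size
    src_cdf = np.cumsum(src_hist)
    tgt_cdf = np.cumsum(target_hist)
    # For each source level, take the smallest target level whose CDF
    # reaches the source CDF value at that level.
    lut = np.searchsorted(tgt_cdf, src_cdf, side="left")
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[img]

# Usage on two synthetic 80 x 80 images (stand-ins for real face crops).
rng = np.random.default_rng(0)
faces = [rng.integers(0, 256, size=(80, 80), dtype=np.uint8) for _ in range(2)]
target = mean_histogram(faces)
normalized = [match_to_histogram(f, target) for f in faces]
```

After this step every image shares approximately the same gray-level distribution, which removes global brightness and contrast differences between photos.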

2. Methods

Features built on facial keypoints avoid the influence of the signal level and of variation across source cameras. The paper uses the following four feature types:
1. Facial landmark points.
2. Facial feature vector generated by modular PCA.
3. Facial feature vector based on Local Binary Pattern (LBP) histograms.
4. The concatenation of the above three feature vectors.
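
To make feature types 3 and 4 concrete, here is a minimal sketch of an LBP histogram and of feature concatenation. It assumes the basic 3 × 3, 8-neighbor LBP over the whole image; these notes do not record which LBP variant, radius, or cell grid the paper uses, and the names `lbp_histogram` and `concatenate_features` are hypothetical.

```python
import numpy as np

def lbp_histogram(gray):
    """Return the normalized 256-bin histogram of basic 3x3 LBP codes."""
    gray = gray.astype(np.int32)
    center = gray[1:-1, 1:-1]
    h, w = center.shape
    # Top-left corners of the eight shifted views, clockwise from top-left.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = gray[dy:dy + h, dx:dx + w]
        # Set this bit wherever the neighbor is at least as bright as the center.
        codes |= (neighbor >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()

def concatenate_features(*vectors):
    """Flatten each feature vector and join them into one long vector."""
    return np.concatenate([np.asarray(v, dtype=np.float64).ravel() for v in vectors])

# Usage with placeholder inputs: a synthetic image and made-up landmark
# and PCA vectors (the sizes here are arbitrary, not the paper's).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(80, 80), dtype=np.uint8)
landmarks = rng.random((10, 2))
pca_vector = rng.random(20)
combined = concatenate_features(landmarks, pca_vector, lbp_histogram(image))
```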

(3 feature-driven classifiers (LR, KNN, SVM) × 4 feature types + 1 data-driven classifier (CNN)) × 10-fold cross validation = 130 cases
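
The case count can be checked by enumerating the experiment grid. The sketch below uses only the standard library; the label strings and the unshuffled, contiguous fold split are assumptions made for illustration (in practice the folds would normally be shuffled or stratified by class).

```python
from itertools import product

FEATURES = ["landmarks", "modular_pca", "lbp", "concatenated"]
FEATURE_DRIVEN = ["LR", "KNN", "SVM"]
N_FOLDS = 10

def experiment_cases():
    """List every (classifier, input, fold) combination in the grid."""
    configs = list(product(FEATURE_DRIVEN, FEATURES))  # 3 x 4 = 12
    configs.append(("CNN", "raw_image"))               # + 1 data-driven model
    return [(clf, feat, fold) for clf, feat in configs for fold in range(N_FOLDS)]

def kfold_indices(n_samples, n_folds=N_FOLDS):
    """Yield (train, test) index lists; fold sizes differ by at most one."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

cases = experiment_cases()
print(len(cases))  # 130
```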

3. Results

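Photos taken with different source cameras may carry camera-specific signatures. Although the landmark-point features above already address this, the paper additionally adds Gaussian noise (mean = 0) to overpower the camera signatures. The results match expectations: performance does not change much (Figures 6 and 7; Tables 2 and 3).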
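
The noise injection can be sketched as below. Only the zero mean comes from these notes; the standard deviation `sigma`, the rounding, and the clipping to the 8-bit range are assumptions, and `add_gaussian_noise` is a hypothetical helper name.

```python
import numpy as np

def add_gaussian_noise(img, sigma, rng=None):
    """Add zero-mean Gaussian noise to a uint8 image and clip to [0, 255]."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(loc=0.0, scale=sigma, size=img.shape)
    noisy = img.astype(np.float64) + noise
    return np.clip(np.rint(noisy), 0, 255).astype(np.uint8)

# Usage on a synthetic mid-gray 80 x 80 image; sigma=8.0 is an arbitrary choice.
rng = np.random.default_rng(0)
face = np.full((80, 80), 128, dtype=np.uint8)
noisy_face = add_gaussian_noise(face, sigma=8.0, rng=rng)
```

If a classifier were relying on faint camera-specific patterns, noise of this kind should degrade it noticeably, which is why stable accuracy is read as evidence against that explanation.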